Darwin
EmoBang: Detecting Emotion From Bengali Texts
Maruf, Abdullah Al, Golder, Aditi, Jiyad, Zakaria Masud, Numan, Abdullah Al, Zaman, Tarannum Shaila
Emotion detection from text seeks to identify an individual's emotional or mental state (positive, negative, or neutral) based on linguistic cues. While significant progress has been made for English and other high-resource languages, Bengali remains underexplored despite being the world's fourth most spoken language. The lack of large, standardized datasets classifies Bengali as a low-resource language for emotion detection. Existing studies mainly employ classical machine learning models with traditional feature engineering, yielding limited performance. In this paper, we introduce a new Bengali emotion dataset annotated across eight emotion categories and propose two models for automatic emotion detection: (i) a hybrid Convolutional Recurrent Neural Network (CRNN) model (EmoBangHybrid) and (ii) an AdaBoost-Bidirectional Encoder Representations from Transformers (BERT) ensemble model (EmoBangEnsemble). Additionally, we evaluate six baseline models with five feature engineering techniques and assess zero-shot and few-shot large language models (LLMs) on the dataset. To the best of our knowledge, this is the first comprehensive benchmark for Bengali emotion detection. Experimental results show that EmoBangHybrid and EmoBangEnsemble achieve accuracies of 92.86% and 93.69%, respectively, outperforming existing methods and establishing strong baselines for future research.
- North America > United States > Maryland (0.04)
- Oceania > Australia > Northern Territory > Darwin (0.04)
- North America > United States > North Carolina > Forsyth County > Winston-Salem (0.04)
Position: We Need Responsible, Application-Driven (RAD) AI Research
Hartman, Sarah, Ong, Cheng Soon, Powles, Julia, Kuhnert, Petra
This position paper argues that achieving meaningful scientific and societal advances with artificial intelligence (AI) requires a responsible, application-driven approach (RAD) to AI research. As AI is increasingly integrated into society, AI researchers must engage with the specific contexts where AI is being applied. This includes being responsive to ethical and legal considerations, technical and societal constraints, and public discourse. We present the case for RAD-AI to drive research through a three-staged approach: (1) building transdisciplinary teams and people-centred studies; (2) addressing context-specific methods, ethical commitments, assumptions, and metrics; and (3) testing and sustaining efficacy through staged testbeds and a community of practice. We present a vision for the future of application-driven AI research to unlock new value through technically feasible methods that are adaptive to the contextual needs and values of the communities they ultimately serve.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- Oceania > Australia > Western Australia (0.04)
- Oceania > Australia > Northern Territory > Darwin (0.04)
- Law (1.00)
- Health & Medicine (1.00)
- Government (1.00)
- Food & Agriculture > Agriculture (0.93)
- Information Technology > Artificial Intelligence > Machine Learning (1.00)
- Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (0.94)
- Information Technology > Artificial Intelligence > Applied AI (0.93)
Revealing the Self: Brainwave-Based Human Trait Identification
Islam, Md Mirajul, Uddin, Md Nahiyan, Hasana, Maoyejatun, Pandit, Debojit, Rahman, Nafis Mahmud, Chellappan, Sriram, Azam, Sami, Islam, A. B. M. Alim Al
People exhibit unique emotional responses. In the same scenario, the emotional reactions of two individuals can be either similar or vastly different. For instance, consider one person's reaction to an invitation to smoke versus another person's response to a query about their sleep quality. The identification of these individual traits through the observation of common physical parameters opens the door to a wide range of applications, including psychological analysis, criminology, disease prediction, addiction control, and more. While there has been previous research in the fields of psychometrics, inertial sensors, computer vision, and audio analysis, this paper introduces a novel technique for identifying human traits in real time using brainwave data. To achieve this, we begin with an extensive study of brainwave data collected from 80 participants using a portable EEG headset. We also conduct a statistical analysis of the collected data utilizing box plots. Our analysis uncovers several new insights, leading us to a groundbreaking unified approach for identifying diverse human traits by leveraging machine learning techniques on EEG data. Our analysis demonstrates that this proposed solution achieves high accuracy. Moreover, we explore two deep-learning models to compare the performance of our solution. Consequently, we have developed an integrated, real-time trait identification solution using EEG data, based on the insights from our analysis. To validate our approach, we conducted a rigorous user evaluation with an additional 20 participants. The outcomes of this evaluation illustrate both high accuracy and favorable user ratings, emphasizing the robust potential of our proposed method to serve as a versatile solution for human trait identification.
- Asia > Bangladesh > Dhaka Division > Dhaka District > Dhaka (0.05)
- Oceania > Australia > Northern Territory > Darwin (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- Health & Medicine > Therapeutic Area > Neurology (1.00)
- Health & Medicine > Therapeutic Area > Psychiatry/Psychology > Mental Health (0.54)
Spatioformer: A Geo-encoded Transformer for Large-Scale Plant Species Richness Prediction
Guo, Yiqing, Mokany, Karel, Levick, Shaun R., Yang, Jinyan, Moghadam, Peyman
Earth observation data have shown promise in predicting species richness of vascular plants ($\alpha$-diversity), but extending this approach to large spatial scales is challenging because geographically distant regions may exhibit different compositions of plant species ($\beta$-diversity), resulting in a location-dependent relationship between richness and spectral measurements. To handle such geolocation dependency, we propose Spatioformer, where a novel geolocation encoder is coupled with the transformer model to encode geolocation context into remote sensing imagery. The Spatioformer model compares favourably to state-of-the-art models in richness predictions on a large-scale ground-truth richness dataset (HAVPlot) that consists of 68,170 in-situ richness samples covering diverse landscapes across Australia. The results demonstrate that geolocational information is advantageous in predicting species richness from satellite observations over large spatial scales. With Spatioformer, plant species richness maps over Australia are compiled from the Landsat archive for the years from 2015 to 2023. The richness maps produced in this study reveal the spatiotemporal dynamics of plant species richness in Australia, providing supporting evidence to inform effective planning and policy development for plant diversity conservation. Regions of high richness prediction uncertainty are identified, highlighting the need for future in-situ surveys in these areas to enhance the prediction accuracy.
- North America > United States (0.28)
- Europe > Switzerland > Zürich > Zürich (0.14)
- Oceania > Australia > Australian Capital Territory > Canberra (0.05)
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Sensing and Signal Processing > Image Processing (0.94)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
Detection of Animal Movement from Weather Radar using Self-Supervised Learning
Haque, Mubin Ul, Dabrowski, Joel Janek, Rogers, Rebecca M., Parry, Hazel
Detecting flying animals (e.g., birds, bats, and insects) using weather radar helps gain insights into animal movement and migration patterns, aids in management efforts (such as biosecurity) and enhances our understanding of the ecosystem. The conventional approach to detecting animals in weather radar involves thresholding: defining and applying thresholds for the radar variables, based on expert opinion. More recently, deep learning approaches have been shown to provide improved performance in detection. However, obtaining sufficient labelled weather radar data for flying animals to build learning-based models is time-consuming and labor-intensive. To address the challenge of data labelling, we propose a self-supervised learning method for detecting animal movement. In our proposed method, we pre-train our model on a large dataset with noisy labels produced by a threshold approach. The key advantage is that the pre-training dataset size is limited only by the number of radar images available. We then fine-tune the model on a small human-labelled dataset. Our experiments on Australian weather radar data for waterbird segmentation show that the proposed method outperforms the current state-of-the-art approach by 43.53% in the Dice coefficient statistic.
- Oceania > Australia > Northern Territory > Darwin (0.14)
- North America > United States (0.14)
- Oceania > Australia > Queensland (0.04)
- Research Report (0.84)
- Overview > Innovation (0.34)
- Health & Medicine > Therapeutic Area (0.54)
- Energy > Renewable (0.46)
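The weather radar abstract above reports its segmentation result in terms of the Dice coefficient. As a minimal sketch of that metric, assuming binary masks flattened into plain Python lists (the function name `dice_coefficient` and the empty-mask convention are illustrative assumptions, not taken from the paper):

```python
def dice_coefficient(pred, target):
    """Dice = 2 * |A and B| / (|A| + |B|) for two binary masks of equal length.

    Hypothetical helper for illustration; masks are flat sequences of 0/1 values.
    """
    if len(pred) != len(target):
        raise ValueError("masks must have the same length")
    # Count pixels that are positive in both masks.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    # Count positive pixels in each mask separately.
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    # Assumption: two empty masks are treated as a perfect match.
    if total == 0:
        return 1.0
    return 2.0 * intersection / total


if __name__ == "__main__":
    predicted = [1, 1, 0, 0]
    labelled = [1, 0, 0, 0]
    print(dice_coefficient(predicted, labelled))
```

A score of 1 means the predicted and labelled masks coincide exactly, and 0 means they share no positive pixels.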
LocalValueBench: A Collaboratively Built and Extensible Benchmark for Evaluating Localized Value Alignment and Ethical Safety in Large Language Models
Meadows, Gwenyth Isobel, Lau, Nicholas Wai Long, Susanto, Eva Adelina, Yu, Chi Lok, Paul, Aditya
The proliferation of large language models (LLMs) requires robust evaluation of their alignment with local values and ethical standards, especially as existing benchmarks often reflect the cultural, legal, and ideological values of their creators. LocalValueBench, introduced in this paper, is an extensible benchmark designed to assess LLMs' adherence to Australian values, and provides a framework for regulators worldwide to develop their own LLM benchmarks for local value alignment. Employing a novel typology for ethical reasoning and an interrogation approach, we curated comprehensive questions and utilized prompt engineering strategies to probe LLMs' value alignment. Our evaluation criteria quantified deviations from local values, ensuring a rigorous assessment process. Comparative analysis of three commercial LLMs by USA vendors revealed significant insights into their effectiveness and limitations, demonstrating the critical importance of value alignment. This study offers valuable tools and methodologies for regulators to create tailored benchmarks, highlighting avenues for future research to enhance ethical AI development.
- North America > United States (0.25)
- Oceania > Australia > Australian Capital Territory > Canberra (0.04)
- Oceania > Australia > Queensland > Brisbane (0.04)
- Law (0.92)
- Law Enforcement & Public Safety > Crime Prevention & Enforcement (0.48)
AgentPeerTalk: Empowering Students through Agentic-AI-Driven Discernment of Bullying and Joking in Peer Interactions in Schools
Paul, Aditya, Yu, Chi Lok, Susanto, Eva Adelina, Lau, Nicholas Wai Long, Meadows, Gwenyth Isobel
Addressing school bullying effectively and promptly is crucial for the mental health of students. This study examined the potential of large language models (LLMs) to empower students by discerning between bullying and joking in school peer interactions. We employed ChatGPT-4, Gemini 1.5 Pro, and Claude 3 Opus, evaluating their effectiveness through human review. Our results revealed that not all LLMs were suitable for an agentic approach, with ChatGPT-4 showing the most promise. We observed variations in LLM outputs, possibly influenced by political overcorrectness, context window limitations, and pre-existing bias in their training data. ChatGPT-4 excelled in context-specific accuracy after implementing the agentic approach, highlighting its potential to provide continuous, real-time support to vulnerable students.
- Oceania > Australia > Australian Capital Territory > Canberra (0.04)
- Oceania > Australia > Queensland > Brisbane (0.04)
- Oceania > Australia > Northern Territory > Darwin (0.04)
- Education > Educational Setting > K-12 Education (0.94)
- Education > Health & Safety > School Safety & Security > School Violence (0.48)
GrowOVER: How Can LLMs Adapt to Growing Real-World Knowledge?
Ko, Dayoon, Kim, Jinyoung, Choi, Hahyeon, Kim, Gunhee
In the real world, knowledge is constantly evolving, which can render existing knowledge-based datasets outdated. This unreliability highlights the critical need for continuous updates to ensure both accuracy and relevance in knowledge-intensive tasks. To address this, we propose GrowOVER-QA and GrowOVER-Dialogue, dynamic open-domain QA and dialogue benchmarks that undergo a continuous cycle of updates, keeping pace with the rapid evolution of knowledge. Our research indicates that retrieval-augmented language models (RaLMs) struggle with knowledge that they have not been trained on or that has been recently updated. Consequently, we introduce a novel retrieval-interactive language model framework, where the language model evaluates and reflects on its answers for further re-retrieval. Our exhaustive experiments demonstrate that our training-free framework significantly improves upon existing methods, performing comparably to or even surpassing continuously trained language models.
- Asia > Middle East > Israel > Tel Aviv District > Tel Aviv (0.05)
- North America > United States > Mississippi (0.04)
- Oceania > Marshall Islands > Ratak Chain > Majuro Atoll > Majuro (0.04)
- Leisure & Entertainment > Sports > Olympic Games (0.94)
- Government > Regional Government (0.93)
- Leisure & Entertainment > Sports > Soccer (0.69)
FWin transformer for dengue prediction under climate and ocean influence
Tran, Nhat Thanh, Xin, Jack, Zhou, Guofa
Dengue fever is one of the deadliest mosquito-borne tropical infectious diseases. A detailed long-range forecast model is vital for controlling the spread of the disease and guiding mitigation efforts. In this study, we examine methods used to forecast dengue cases for long-range predictions. The dataset consists of local climate/weather data for Singapore, in addition to global climate indicators, from 2000 to 2019. We utilize newly developed deep neural networks to learn the intricate relationship between the features. The baseline models in this study are in the class of recent transformers for long sequence forecasting tasks. We found that a Fourier mixed window attention (FWin) based transformer performed the best in terms of both the mean square error and the maximum absolute error on the long-range dengue forecast up to 60 weeks.
- Asia > Singapore (0.25)
- North America > United States > California > Orange County > Irvine (0.14)
- Indian Ocean (0.05)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Data Science > Data Quality (0.94)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
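The dengue forecasting abstract above compares models by mean square error and maximum absolute error. A minimal sketch of both metrics follows, assuming the observed and forecast case counts are equal-length lists of numbers (the function names and the sample values are hypothetical and are not data from the paper):

```python
def mean_squared_error(actual, forecast):
    """Average of squared differences between observed and forecast values."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual)


def max_absolute_error(actual, forecast):
    """Largest single absolute deviation across the forecast horizon."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return max(abs(a - f) for a, f in zip(actual, forecast))


if __name__ == "__main__":
    # Made-up weekly case counts, for illustration only.
    observed = [10, 20, 30]
    predicted = [12, 18, 33]
    print(mean_squared_error(observed, predicted))
    print(max_absolute_error(observed, predicted))
```

The two metrics are complementary: mean square error summarizes average accuracy over the whole horizon, while maximum absolute error exposes the single worst week, which matters when a forecast must not miss an outbreak peak.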
Open-Source Ground-based Sky Image Datasets for Very Short-term Solar Forecasting, Cloud Analysis and Modeling: A Comprehensive Survey
Nie, Yuhao, Li, Xiatong, Paletta, Quentin, Aragon, Max, Scott, Andea, Brandt, Adam
Sky-image-based solar forecasting using deep learning has been recognized as a promising approach to reducing the uncertainty in solar power generation. However, one of the biggest challenges is the lack of massive and diversified sky image samples. In this study, we present a comprehensive survey of open-source ground-based sky image datasets for very short-term solar forecasting (i.e., a forecasting horizon of less than 30 minutes), as well as related research areas which can potentially help improve solar forecasting methods, including cloud segmentation, cloud classification and cloud motion prediction. We first identify 72 open-source sky image datasets that satisfy the needs of machine/deep learning. Then a database of information about various aspects of the identified datasets is constructed. To evaluate each surveyed dataset, we further develop a multi-criteria ranking system based on 8 dimensions of the datasets which could have important impacts on usage of the data. Finally, we provide insights on the usage of these datasets for different applications. We hope this paper can provide an overview for researchers who are looking for datasets for very short-term solar forecasting and related areas.
- North America > United States > Colorado > Jefferson County > Golden (0.14)
- North America > United States > California > Los Angeles County > Los Angeles (0.14)
- Europe > Portugal > Coimbra > Coimbra (0.04)
- Overview (1.00)
- Research Report > New Finding (0.34)
- Energy > Renewable > Solar (1.00)
- Energy > Power Industry (1.00)
- Government > Regional Government > North America Government > United States Government (0.67)